ABSTRACT

Using observations of prestellar cores to infer their intrinsic properties requires the
solution of several poorly constrained inverse problems. Here we address one of these
problems, namely to deduce from the observed aspect ratios of prestellar cores their
intrinsic three-dimensional shapes. Four models are proposed, all based on the standard
assumption that prestellar cores are ellipsoidal, and on the further assumption that
a core's shape is not correlated with its absolute size. The first and simplest model,
M1, has a single free parameter, and assumes that the relative axes of a prestellar
core are drawn randomly from a log-normal distribution with zero mean and standard
deviation $\tau_0$. The second model, M2a, has two free parameters, and assumes that the
log-normal distribution (with standard deviation $\tau_0$) has a finite mean, $\nu_0$, defined
so that $\nu_0 < 0$ means elongated (or filamentary) cores are favoured, whereas $\nu_0 > 0$
means flattened (or disc-like) cores are favoured. Details of the third model (M2b,
two free parameters) and the fourth model (M4, four free parameters) are given in the
text. Markov chain Monte Carlo sampling and Bayesian analysis are used to map out
the posterior probability density functions of the model parameters, and the relative
merits of the models are compared using Bayes factors. We show that M1 provides an
acceptable fit to the data with $\tau_0 \approx 0.57 \pm 0.06$; and that, although the other models
sometimes provide an improved fit, there is no strong justification for the introduction
of their additional parameters.
1 INTRODUCTION

Several models have previously been developed to fit the observed
aspect ratios of cores, using both randomly oriented spheroids
(e.g. Myers et al. 1991; Ryden 1996) and randomly oriented ellipsoids
(e.g. Goodwin et al. 2002; Jones & Basu 2002; Tassis 2007). These
models invoke from two to four free parameters. In this paper we
introduce a model in which the intrinsic shapes of prestellar cores
are characterised by just one free parameter. Using Markov chain
Monte Carlo (MCMC) sampling, we generate a probability density
function (PDF) for this parameter, based on continuum observations
of the cores in Ophiuchus by Simpson et al. (2008) (SNW-T08),
Stanke et al. (2006) (SSGK06) and Motte et al. (1998) (MAN98). We
also define three more complex models, which, by introducing
additional parameters, are better able to fit the observed
distribution of projected shapes. By means of Bayesian analysis we
show that these more complex models are not justified.

In Section 2 we introduce each of the models, and explain how we
derive projected shapes. In Section 3 we describe how we use Bayesian
analysis to identify the best-fit model parameters, and the best
models, using the observational data. In Section 4 we present and
discuss the results, and in Section 5 we summarize our conclusions.



1.1 Observational data 

We use data from three different surveys of dust continuum emission.
The SNW-T08 data were obtained using SCUBA on the JCMT at 850 μm;
52 cores were identified, and fitted with ellipses at the 3σ noise
contour. The SSGK06 data were obtained using SIMBA on the SEST
telescope at 1.2 mm; 143 cores were extracted using Gaussclumps
(Stutzki & Güsten 1990). The MAN98 data were obtained using the
MPIfR array on the IRAM telescope at 1.3 mm; 35 cores were
extracted, again using Gaussclumps.



2 MODELLING THE SHAPES OF CORES 

We follow the convention that, to a first approximation, the
projected shape of a core can be represented by an ellipse (with
axes a, b, a ≥ b, and aspect ratio q = b/a), and its intrinsic
shape by an ellipsoid (with principal axes A, B, C, A ≥ B ≥ C).
In addition, we assume that the proportions of a core (A : B : C)
are independent of its absolute size.

2.1 Model M1, one free parameter ($\tau_0$)

The first model, M1, has only one free parameter, $\tau_0$. The
principal axes of a sample ellipsoid are then generated according to

$$
\begin{aligned}
A &= 1, \\
B &= \exp(\tau_0\,\mathcal{G}_B), \\
C &= \exp(\tau_0\,\mathcal{G}_C).
\end{aligned} \tag{2.1}
$$

Here, and in the sequel, $\mathcal{G}_B$ and $\mathcal{G}_C$ are random deviates
drawn from a Gaussian distribution with zero mean and unit standard
deviation. Increasing $\tau_0$ increases the likelihood that the axes
of a core have very disparate sizes, and hence the likelihood that
the projected shape of the core has a small aspect ratio, q.



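To observe a core from an arbitrary direction, we work in the
Cartesian co-ordinate system of Section 2.5 (x-axis along A, y-axis
along B, z-axis along C) and specify the viewing direction by polar
angles $(\theta, \phi)$, setting

$$
\theta = \cos^{-1}(2\mathcal{R}_\theta - 1), \tag{2.5}
$$

$$
\phi = 2\pi\,\mathcal{R}_\phi, \tag{2.6}
$$

where $\mathcal{R}_\theta$ and $\mathcal{R}_\phi$ are random deviates from a uniform
distribution on the interval (0, 1). The aspect ratio of the core
is then given by

$$
q = \left[\frac{\alpha + \gamma - \sqrt{(\alpha - \gamma)^2 + \beta^2}}
               {\alpha + \gamma + \sqrt{(\alpha - \gamma)^2 + \beta^2}}\right]^{1/2}, \tag{2.7}
$$

where

$$
\alpha = \left(A^2\cos^2\phi + B^2\sin^2\phi\right)\cos^2\theta + C^2\sin^2\theta, \tag{2.8}
$$

$$
\beta = \left(B^2 - A^2\right)\cos\theta\,\sin 2\phi, \tag{2.9}
$$

$$
\gamma = A^2\sin^2\phi + B^2\cos^2\phi \tag{2.10}
$$

(see Binney 1985).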
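
The projection in equations (2.5) to (2.10) can be sketched in a few lines of Python. This is only an illustrative sketch, not the code used for the paper; the function names `random_viewing_angles` and `projected_aspect_ratio` are hypothetical, and only the standard library is assumed.

```python
import math
import random


def random_viewing_angles(rng=random):
    """Draw (theta, phi) for an isotropically distributed line of sight."""
    theta = math.acos(2.0 * rng.random() - 1.0)  # equation (2.5)
    phi = 2.0 * math.pi * rng.random()           # equation (2.6)
    return theta, phi


def projected_aspect_ratio(A, B, C, theta, phi):
    """Aspect ratio q = b/a of the projected ellipse, equations (2.7)-(2.10)."""
    alpha = ((A**2 * math.cos(phi)**2 + B**2 * math.sin(phi)**2)
             * math.cos(theta)**2 + C**2 * math.sin(theta)**2)
    beta = (B**2 - A**2) * math.cos(theta) * math.sin(2.0 * phi)
    gamma = A**2 * math.sin(phi)**2 + B**2 * math.cos(phi)**2
    root = math.sqrt((alpha - gamma)**2 + beta**2)
    # Guard against a tiny negative numerator caused by rounding.
    return math.sqrt(max(alpha + gamma - root, 0.0) / (alpha + gamma + root))
```

Two limiting cases give a quick check: a sphere has q = 1 from every direction, and an ellipsoid viewed along its shortest axis ($\theta = 0$) has q = B/A.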



2.2 Model M2a, two free parameters ($\nu_0$, $\tau_0$)

The second model, M2a, has two free parameters, $\nu_0$ and $\tau_0$.
The principal axes of a sample ellipsoid are then generated
according to

$$
\begin{aligned}
A &= 1, \\
B &= \exp(\nu_0 + \tau_0\,\mathcal{G}_B), \\
C &= \exp(\nu_0 + \tau_0\,\mathcal{G}_C).
\end{aligned} \tag{2.2}
$$

If $\nu_0 > 0$, the cores have a tendency to be disc-like; with
$\tau_0 = 0$, they are oblate spheroids. If $\nu_0 < 0$, the cores have
a tendency to be filamentary; with $\tau_0 = 0$, they are prolate
spheroids.



2.3 Model M2b, two free parameters ($\tau_B$, $\tau_C$)

The third model, M2b, has two free parameters, $\tau_B$ and $\tau_C$.
The principal axes of a sample ellipsoid are then generated
according to

$$
\begin{aligned}
A &= 1, \\
B &= \exp(\tau_B\,\mathcal{G}_B), \\
C &= \exp(\tau_C\,\mathcal{G}_C).
\end{aligned} \tag{2.3}
$$



2.4 Model M4, four free parameters ($\nu_B$, $\tau_B$, $\nu_C$, $\tau_C$)

The fourth model, M4, has four free parameters, $\nu_B$, $\tau_B$,
$\nu_C$ and $\tau_C$. The principal axes of a sample ellipsoid are then
generated according to

$$
\begin{aligned}
A &= 1, \\
B &= \exp(\nu_B + \tau_B\,\mathcal{G}_B), \\
C &= \exp(\nu_C + \tau_C\,\mathcal{G}_C).
\end{aligned} \tag{2.4}
$$
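
Since M1, M2a and M2b are special cases of M4, a single sampler covers equations (2.1) to (2.4). The following is a minimal sketch under that observation, not the authors' implementation; the function name `sample_axes` and its default values are hypothetical, and the final sort is the re-ordering step described in Section 2.5.

```python
import math
import random


def sample_axes(nu_B=0.0, tau_B=0.5, nu_C=0.0, tau_C=0.5, rng=random):
    """Draw principal axes (A, B, C), returned sorted so that A >= B >= C.

    M4 uses all four parameters (equation 2.4).
    M1:  nu_B = nu_C = 0 and tau_B = tau_C = tau_0.
    M2a: nu_B = nu_C = nu_0 and tau_B = tau_C = tau_0.
    M2b: nu_B = nu_C = 0.
    """
    g_B = rng.gauss(0.0, 1.0)  # Gaussian deviate, zero mean, unit std
    g_C = rng.gauss(0.0, 1.0)
    axes = [1.0,
            math.exp(nu_B + tau_B * g_B),
            math.exp(nu_C + tau_C * g_C)]
    axes.sort(reverse=True)    # re-order so that A >= B >= C
    return tuple(axes)
```

For example, `sample_axes(nu_B=0.5, tau_B=0.0, nu_C=0.5, tau_C=0.0)` returns an oblate spheroid (A = B > C), matching the $\nu_0 > 0$, $\tau_0 = 0$ limit of M2a.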

2.5 Projecting an arbitrarily oriented ellipsoid

Once the principal axes of a core have been generated, they are
re-ordered so that A ≥ B ≥ C. Next, without loss of generality, we
define a Cartesian co-ordinate system in which the x-axis is aligned
along A, the y-axis along B, and the z-axis along C. The core is
then viewed from a random direction, and its projected aspect ratio
follows from equations (2.5) to (2.10).



3 BAYESIAN ANALYSIS 

We use Bayesian analysis to determine the best-fit parameters of the
different models, and to quantify their relative strengths. When
comparing model M with parameters $\mathbf{x} = (x_1, x_2, \ldots)$ against
observational data D, Bayes' theorem states that

$$
P(\mathbf{x}|M,D) = \frac{P(D|M,\mathbf{x})\,P(\mathbf{x}|M)}{P(D|M)}. \tag{3.1}
$$

Here $P(\mathbf{x}|M,D)$ is the posterior probability of $\mathbf{x}$ given D,
$P(D|M,\mathbf{x})$ is the likelihood of D given $\mathbf{x}$, $P(\mathbf{x}|M)$ is the
prior PDF of $\mathbf{x}$ and $P(D|M)$ is the marginal likelihood over all
values of $\mathbf{x}$, i.e.

$$
P(D|M) = \int_{\mathbf{x}} P(D|M,\mathbf{x})\,P(\mathbf{x}|M)\,d\mathbf{x}. \tag{3.2}
$$

As $P(D|M)$ does not depend on $\mathbf{x}$, equation (3.1) simplifies to

$$
P(\mathbf{x}|M,D) \propto P(D|M,\mathbf{x})\,P(\mathbf{x}|M), \tag{3.3}
$$

where any generated posterior PDFs can be normalized to unity after
the analysis.

3.1 Prior PDF 

When generating prior PDFs for the model parameters $\mathbf{x}$ we
assume that $P(\mathbf{x}|M)$ is finite and uniform within given limits,
and zero outside them. That is to say, within credible limits, we
impose no a priori preference for any specific $\mathbf{x}$.

For M1, the single parameter $\tau_0$ must be able to reproduce the
maximum and minimum observed aspect ratios in the data, viz.
$q_{\rm MAX} \simeq 1$ and $q_{\rm MIN} \simeq 0.3$ (over all three data sets, there
are only two cores with $q < 0.3$). Since the majority of aspect
ratios delivered by M1 satisfy $q \gtrsim \exp(-\tau_0)$, we set
$-\ln(q_{\rm MAX}) < \tau_0 < -\ln(q_{\rm MIN})$, i.e. $0 < \tau_0 < 1.2$.

For M2a we set the range of $\nu_0$ to $-1.2 < \nu_0 < 1.2$, so that a
purely oblate or prolate population (i.e. one with $\tau_0 = 0$) could
reproduce the observed aspect ratios. We then assign $\tau_0$ the same
range as in M1.

For both M2b and M4 we assign $\tau_B$ and $\tau_C$ the same range as
$\tau_0$ in M1. For M4 we assign $\nu_B$ and $\nu_C$ the same range as
$\nu_0$ in M2a.
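With these ranges, the normalised prior PDFs are

$$
\begin{aligned}
P(\tau_0|{\rm M1}) &=
\begin{cases}
\dfrac{1}{1.2}, & \text{if } 0 < \tau_0 < 1.2, \\
0, & \text{otherwise;}
\end{cases} \\[6pt]
P(\nu_0, \tau_0|{\rm M2a}) &=
\begin{cases}
\dfrac{1}{2(1.2)^2}, & \text{if } -1.2 < \nu_0 < 1.2 \text{ and } 0 < \tau_0 < 1.2, \\
0, & \text{otherwise;}
\end{cases} \\[6pt]
P(\tau_B, \tau_C|{\rm M2b}) &=
\begin{cases}
\dfrac{1}{(1.2)^2}, & \text{if } 0 < \tau_B < 1.2 \text{ and } 0 < \tau_C < 1.2, \\
0, & \text{otherwise;}
\end{cases} \\[6pt]
P(\nu_B, \nu_C, \tau_B, \tau_C|{\rm M4}) &=
\begin{cases}
\dfrac{1}{4(1.2)^4}, & \text{if } -1.2 < \nu_B < 1.2 \text{ and } -1.2 < \nu_C < 1.2 \\
 & \text{and } 0 < \tau_B < 1.2 \text{ and } 0 < \tau_C < 1.2, \\
0, & \text{otherwise.}
\end{cases}
\end{aligned} \tag{3.4}
$$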
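
All four priors in equation (3.4) have the same form: one over the volume of a box, and zero outside it. A minimal sketch of that rule follows; the function name `uniform_prior` and its argument layout are hypothetical.

```python
def uniform_prior(x, lower, upper):
    """Normalised uniform prior density for the parameter vector x.

    lower and upper hold the limits of each parameter, in the same
    order as x. Returns 1/volume inside the box and 0 outside it.
    """
    volume = 1.0
    for xi, lo, hi in zip(x, lower, upper):
        if not (lo < xi < hi):
            return 0.0
        volume *= hi - lo
    return 1.0 / volume
```

For M4, with limits (-1.2, 1.2) on $\nu_B$ and $\nu_C$ and (0, 1.2) on $\tau_B$ and $\tau_C$, the box volume is $2.4^2 \times 1.2^2 = 4(1.2)^4$, as in the last line of equation (3.4).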



3.2 Markov chain Monte Carlo sampling 



For each observational data set, D, we generate a histogram of
aspect ratios. The histogram has ten bins (k = 1 to 10), evenly
spaced between q = 0 and q = 1, and $O_k$ is the number of observed
cores in bin k.

For a given model, M, and a given choice of the associated free
parameters, $\mathbf{x}_i$, we generate $10^4$ ellipsoids, and view each one
from an arbitrary direction to determine its aspect ratio, q, as
described in Section 2. The resulting q-values are then used to
construct an equivalent histogram of expectation values, $E_j$
(j = 1 to 10), normalised so that $\sum_j E_j = \sum_j O_j$. The
likelihood of the observational data, D, being reproduced by M
with parameters $\mathbf{x}_i$ is then

$$
P(D|M,\mathbf{x}_i) \propto \exp\left\{-\sum_{j=1}^{10}\frac{(E_j - O_j)^2}{2\,O_j}\right\}. \tag{3.5}
$$

We have assumed purely Poisson errors on the counts in each bin,
because error estimates for individual q-values are not available.
Bins that have fewer than five counts are pooled together so that
the Gaussian approximation to Poisson errors is valid.
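
The histogram comparison behind equation (3.5) might be coded as below. This is a sketch under stated assumptions: the text does not say exactly how low-count bins are pooled, so here every bin with fewer than five observed cores is merged into one pooled bin. The names `histogram` and `log_likelihood` are hypothetical.

```python
def histogram(q_values, n_bins=10):
    """Count q-values in n_bins equal bins spanning 0 <= q <= 1."""
    counts = [0] * n_bins
    for q in q_values:
        k = min(int(q * n_bins), n_bins - 1)  # q = 1 falls in the last bin
        counts[k] += 1
    return counts


def log_likelihood(observed, model_counts, min_count=5):
    """Logarithm of equation (3.5), up to an additive constant."""
    # Rescale the model histogram so that sum(E) equals sum(O).
    scale = sum(observed) / sum(model_counts)
    expected = [scale * m for m in model_counts]
    # Assumption: all bins with fewer than min_count observed cores
    # are merged into a single pooled bin.
    pooled_O = sum(o for o in observed if o < min_count)
    pooled_E = sum(e for o, e in zip(observed, expected) if o < min_count)
    pairs = [(o, e) for o, e in zip(observed, expected) if o >= min_count]
    if pooled_O > 0:
        pairs.append((pooled_O, pooled_E))
    # Poisson variance on each observed count is O_j.
    return -sum((e - o) ** 2 / (2.0 * o) for o, e in pairs)
```

Working with the logarithm avoids underflow when the exponent in equation (3.5) is large and negative, and only differences of log-likelihoods are needed when stepping along a Markov chain.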

To build a Markov chain, we consider the observational values,
$O_k$, from a particular data set, D, and we invoke a particular
model, M. We pick a set of model parameters ($\mathbf{x}_0$) in the middle
of the ranges defined in Section 3.1 and compute $P(D|M,\mathbf{x}_0)$, as
described in the preceding paragraph. We then build the chain by
stepping from one set of model parameters to another,
$\mathbf{x}_0 \to \mathbf{x}_1 \to \mathbf{x}_2 \to \mathbf{x}_3 \ldots$ Each step,
$\Delta\mathbf{x} = \mathbf{x}_{i+1} - \mathbf{x}_i$, is drawn randomly from a Gaussian
distribution centred on zero. The step is only made if

$$
\frac{P(D|M,\mathbf{x}_{i+1})}{P(D|M,\mathbf{x}_i)} > \mathcal{R}_{\rm STEP}, \tag{3.6}
$$

where $\mathcal{R}_{\rm STEP}$ is a random deviate from a uniform distribution
on the interval (0, 1). Otherwise the step is rejected and a new
step is drawn; this ensures that the points on the chain tend to
concentrate in regions of high probability. The coefficients
regulating the mean step size should be adjusted so that roughly
half the steps are rejected. The first $10^3$ points on the chain
are discarded, to remove any memory of the starting point. The
subsequent $5 \times 10^5$ points are used to identify the best-fit
parameters and their uncertainties.
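
The stepping rule of equation (3.6) is the Metropolis algorithm with a Gaussian proposal. The sketch below is illustrative only and makes two assumptions that the text does not state: a rejected trial step leaves the chain at its current point, which is recorded again (the standard Metropolis convention), and trial points outside the prior limits of Section 3.1 are always rejected. The function name `metropolis` and its arguments are hypothetical.

```python
import math
import random


def metropolis(log_like, x0, lower, upper, step_sizes, n_steps,
               n_burn=1000, seed=0):
    """Return a list of (point, log-likelihood) pairs after burn-in."""
    rng = random.Random(seed)
    x = list(x0)
    logL = log_like(x)
    chain = []
    for i in range(n_burn + n_steps):
        # Gaussian step centred on zero, one component per parameter.
        trial = [xi + rng.gauss(0.0, s) for xi, s in zip(x, step_sizes)]
        inside = all(lo < t < hi for t, lo, hi in zip(trial, lower, upper))
        if inside:
            logL_trial = log_like(trial)
            # Equation (3.6) in logarithmic form; 1 - random() lies in
            # (0, 1], so the logarithm is always defined.
            if logL_trial - logL > math.log(1.0 - rng.random()):
                x, logL = trial, logL_trial
        if i >= n_burn:
            chain.append((tuple(x), logL))
    return chain
```

In use, `log_like` would wrap the model simulation and the histogram comparison of equation (3.5), and `step_sizes` would be tuned until roughly half of the trial steps are rejected.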



Figure 1. Posterior PDFs for $\tau_0$ in M1, from the SNW-T08,
SSGK06 and MAN98 data.



We have built a Markov chain for each possible combination of the
four models and the three data sets. The points on the chain are
then used to determine the posterior PDFs of the model parameters.
The results are presented in Figs 1, 2, 3 and 4. The best fits
obtained with M1 and M2a are compared with the observations in
Figs 5 and 6.

3.3 Model selection 

Bayesian analysis can also be used to compare different models.
Given a list of competing models, $M_1, M_2, \ldots, M_n$, the
probability of a particular model, $M_k$, is

$$
P(M_k|D) = \frac{P(D|M_k)\,P(M_k)}{P(D)}, \tag{3.7}
$$

where

$$
P(D) = \sum_{k=1}^{n} P(D|M_k)\,P(M_k). \tag{3.8}
$$

To calculate $P(D|M_k)$ we must marginalise each model's likelihood
over its associated parameter space (see Eqn. 3.2).


We evaluate this integral by organising the points on the
associated Markov chain into a balanced binary tree (Weinberg 2009).
This has the effect of dividing the parameter space into cells, each
of which contains a single point. Each point, $\mathbf{x}_i$, now has a
likelihood (see Eqn. 3.5) and a volume of parameter space,
$\delta V_i$, equal to the volume of the cell it occupies. Hence the
marginalised likelihood is approximated by

$$
P(D|M_k) \simeq \frac{1}{V}\sum_{i=1}^{N} P(D|M_k,\mathbf{x}_i)\,\delta V_i. \tag{3.9}
$$

Here N is the number of points on the Markov chain and V is the
total volume of parameter space associated with model $M_k$. As MCMC
sampling is most noisy around the edges of the distribution, we omit
from the summation any cells that extend to the boundaries of the
parameter space. These regions are undersampled and have
disproportionately large cells; including them generally
overestimates $P(D|M_k)$.

The relative likelihood of one model, $M_k$, with respect to
another, $M_{k'}$, is quantified by the Bayes factor

$$
K_{kk'} = \frac{P(M_k|D)}{P(M_{k'}|D)}
        = \frac{P(D|M_k)\,P(M_k)}{P(D|M_{k'})\,P(M_{k'})}. \tag{3.10}
$$

Given that we have no a priori preference for either model, i.e.
$P(M_k) = P(M_{k'})$, equation (3.10) reduces to the ratio of the
marginalised likelihoods. Bayes factors quantifying the relative
performance of the different models are presented in Table 1.

Figure 2. Posterior PDFs for $\nu_0$ and $\tau_0$ in M2a, from the
SNW-T08, SSGK06 and MAN98 data.

Figure 3. Posterior PDFs for $\tau_B$ and $\tau_C$ in M2b, from the
SNW-T08, SSGK06 and MAN98 data. The white dashed line represents
$\tau_B = \tau_C$, about which the distribution should be symmetric.

Figure 4. Posterior PDFs for $\nu_B$ and $\nu_C$ in M4, from the
SNW-T08, SSGK06 and MAN98 data. $\tau_B$ and $\tau_C$ have been
marginalized out, i.e. $P(\nu_B, \nu_C|{\rm M4}, D) =
\int\!\!\int P(\nu_B, \nu_C, \tau_B, \tau_C|{\rm M4}, D)\,d\tau_B\,d\tau_C$. The white
dashed line represents $\nu_B = \nu_C$, about which the distribution
should be symmetric.



4 RESULTS



4.1 Parameter estimation for M1

Fig. 1 shows the posterior PDFs for $\tau_0$ in M1, based on the
different data sets. Since the PDFs are all unimodal and not overly
skewed, we can calculate means and standard deviations, viz.
$\tau_0 = 0.51 \pm 0.06$ (SNW-T08), $0.56 \pm 0.03$ (SSGK06) and
$0.63 \pm 0.08$ (MAN98). These values can be combined to give
$\tau_0 = 0.57 \pm 0.07$, i.e. the principal axes of a core typically
differ by a factor of order $\exp(\tau_0) \sim 1.7$.

Figure 5 compares the observed distributions of aspect ratio from
the different data sets with the best fits from M1. M1 fits the
SSGK06 data (which, with 142 cores, has the least noisy statistics)
well. The fits to the SNW-T08 data (52 cores) and the MAN98 data
(35 cores) are less good. For example, the MAN98 data hint at a
sharp peak between q = 0.5 and q = 0.6 which M1 is unable to
reproduce; however, this may just be the product of small-number
statistics.



4.2 Parameter estimation for M2a 

Fig. 2 shows the posterior PDFs of $\nu_0$ and $\tau_0$ in M2a, based
on the different data sets. We recall that $\nu_0$ determines whether
cores have a tendency to be filamentary ($\nu_0 < 0$) or disc-like
($\nu_0 > 0$). For all three data sets there is a degeneracy, because
the intrinsic asymmetry of the cores is promoted both by increasing
$\tau_0$, and by increasing $|\nu_0|$. Consequently solutions with
reduced $\tau_0$ and increased $|\nu_0|$ have high probability. Indeed,
for the SSGK06 and MAN98 data sets these are actually the preferred
solutions. However, in neither case is there a clear preference for
filamentary over disc-like cores, or vice versa.

Figure 6 compares the observed distributions of aspect ratio with
the best fits from M2a. M2a delivers a markedly better fit than M1
to the SSGK06 and MAN98 data sets, irrespective of whether we use
the filamentary or disc-like parameters. However, the best fit to
the SNW-T08 data set is not much better than with M1.



4.3 Parameter estimation for M2b 

Figure 3 shows the posterior PDFs of $\tau_B$ and $\tau_C$ in M2b,
based on the different data sets. For the SSGK06 data we find a
peak at $\tau_B \simeq \tau_C \simeq 0.55$, and for the MAN98 data at
$\tau_B \simeq \tau_C \simeq 0.60$. For the SNW-T08 data, the distribution of
$\tau_B$ and $\tau_C$ is somewhat broader, but nowhere does it exceed
that for $\tau_B \simeq \tau_C \simeq 0.50$. Thus, for all three data sets, the
two parameters of M2b do not furnish a better fit than the single
parameter of M1.



4.4 Parameter estimation for M4 

Figure 4 shows the posterior PDFs of $\nu_B$ and $\nu_C$ in M4,
based on the different data sets; for simplicity, we have
marginalized $\tau_B$ and $\tau_C$ out of the PDFs. For all three data
sets, the highest probabilities lie on the line $\nu_B \simeq \nu_C$,
which suggests that M4 is unable to improve on M2a. There are also
regions of moderately high probability where $\nu_B \neq \nu_C$, but
these are outweighed by the regions where $\nu_B \simeq \nu_C$, and
there is no hint of a preference for filamentary over disc-like
shapes, or vice versa.

Figure 5. The histograms represent the distributions of aspect
ratio obtained by SSGK06 (top), MAN98 (middle) and SNW-T08
(bottom), with errors. The dashed lines represent the best
fits obtained with M1.



4.5 Model selection 

We quantify the quality of the different models, for the different
data sets, by calculating Bayes factors, $K_{kk'}$, as described in
Section 3.3. The results are presented in Table 1, where K > 1
indicates a preference for the model denoted in the column header,
and K < 1 indicates a preference for the model in the row label.
Jeffreys (1961) suggests the following qualitative interpretation
for different values of $K_{kk'}$:

    $K_{kk'} < 1/10$            Strongly supports $M_{k'}$
    $1/10 < K_{kk'} < 1/3$      Moderately supports $M_{k'}$
    $1/3 < K_{kk'} < 1$         Weakly supports $M_{k'}$
    $K_{kk'} = 1$               No preference
    $1 < K_{kk'} < 3$           Weakly supports $M_k$
    $3 < K_{kk'} < 10$          Moderately supports $M_k$
    $K_{kk'} > 10$              Strongly supports $M_k$

We stress that these categories are only intended to be indicative.
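
As a small worked illustration, the qualitative scale can be encoded as a lookup. This sketch assumes the scale is symmetric about K = 1, with thresholds at 1/10, 1/3, 3 and 10; the assignment of values that fall exactly on a threshold is an arbitrary choice made here, and the function name `interpret_bayes_factor` is hypothetical.

```python
def interpret_bayes_factor(K):
    """Qualitative reading of a Bayes factor K = K_kk' (must be positive)."""
    if K <= 0:
        raise ValueError("a Bayes factor must be positive")
    if K == 1:
        return "no preference"
    favoured = "Mk" if K > 1 else "Mk'"
    # Fold K < 1 onto K > 1, so one set of thresholds serves both sides.
    strength = K if K > 1 else 1.0 / K
    if strength > 10:
        label = "strongly"
    elif strength > 3:
        label = "moderately"
    else:
        label = "weakly"
    return f"{label} supports {favoured}"
```

For instance, `interpret_bayes_factor(0.2)` returns "moderately supports Mk'", since 0.2 lies between 1/10 and 1/3.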

The SNW-T08 data are much better fitted by M1 or M2b than by M2a
or M4; there is little to choose between M1 and M2b. Conversely,
the SSGK06 and MAN98 data sets are both fitted best by M2a, with M1
also giving a good fit, and M2b and M4 giving relatively poor fits.
To combine the data sets, we have simply taken the products of
their individual Bayes factors, and these are given in the last
panel of Table 1. These values suggest that M1 is the best model.
M2a is almost as good, and should remain in the reckoning against
the day when sufficient data are available to distinguish between
filamentary and disc-like cores.

Figure 6. As Fig. 5, but for M2a.

Since our analysis has not included the errors on individual data
points (they are not available), the Poisson errors in Figs 5 and 6,
and in Eqn. (3.5), should be larger. This would broaden the
posterior PDFs for all models, but the effect would tend to be
larger for models with more free parameters, in the sense that the
probability would be smeared over more dimensions, and therefore
their marginal likelihoods would be reduced more. Since we have
already concluded that M1 performs best, we infer that this
conclusion would be reinforced if observational errors were
included.



5 CONCLUSIONS 

We have used Bayesian analysis to infer the intrinsic shapes of
prestellar cores in Ophiuchus. We find that the observational data
are well fitted with a one-parameter model, M1, in which cores are
triaxial ellipsoids with axes chosen from a log-normal distribution
having zero mean and standard deviation $\tau_0 \approx 0.57 \pm 0.07$. The
two-parameter models, M2a and M2b, do not sufficiently improve the
fit to justify their adoption, and the four-parameter model, M4, is
completely unjustified.

Table 1. Bayes factors, $K = P(M_{\rm COLUMN}|D)/P(M_{\rm ROW}|D)$,
calculated using Eqn. 3.10. The first three panels give values for
the individual data sets, and the fourth panel gives their product.